Abstract:

Agriculture is the backbone of the Indian economy and a source of
employment for millions of people across the globe. Agricultural data
and parameters supply information that can be studied to better
understand farming outcomes. Tamil Nadu, being a coastal state, faces
agricultural unpredictability, which reduces productivity. Production
could in principle be increased with more labour and land, but neither
can readily be expanded. Machine learning techniques use data to build
a well-defined model that assists in making predictions. Problems such
as crop forecasting, crop rotation, water requirements, fertilizer
requirements, and crop protection can be addressed in this way. Because
climatic conditions vary, an effective approach to support crop
cultivation and assist farmers in production and management is
required. Such an approach may also help aspiring farmers improve their
farming practices. With the use of data mining, a farmer can be
presented with a system of suggestions to assist them in crop
production. Crops are recommended based on climatic parameters and
quantity. Data analytics paves the way for extracting valuable
information from agricultural databases. Crop data has been evaluated,
and crop suggestions have been made based on productivity and season.
We have applied these models to different datasets to obtain better
accuracy.

Keywords- Data set, K Nearest Neighbor
(KNN), Data Preprocessing, Multiple Linear
Regression (MLR).




I. INTRODUCTION 

Tamil Nadu, India's seventh-largest state, has the sixth-largest
population. It is one of India's leading producers of agricultural
products, and agriculture is a major source of income in the state.
Agriculture continues to play a positive role in today's world. The
Cauvery River is the primary source of water, and the Cauvery delta
region is known as the rice bowl of Tamil Nadu. The main crop farmed in
Tamil Nadu is rice (paddy). Other crops include sugarcane, cotton,
coconut, and peanuts. Bio-fertilizers are also produced efficiently. In
many places, farming is the most common source of income.

Agriculture has a significant influence on a country's economy.
Cultivation is deteriorating due to changes in natural factors.
Agriculture is directly affected by environmental factors such as
sunshine, humidity, soil type, rainfall, maximum and minimum
temperatures, climate, fertilizers, and pesticides. Knowledge of how to
harvest crops correctly is needed to succeed in agriculture.

Farmers face substantial challenges such as crop management, crop yield
prediction, and crop output. Farmers need adequate cultivation
assistance, since many young people are interested in agriculture these
days. The impact of the IT industry on analyzing real-world problems is
increasing, and data in the agricultural sector is growing by the day.


II. LITERATURE REVIEW


2.1 DATA MINING AND WIRELESS SENSOR 
NETWORK FOR AGRICULTURE 
PEST/DISEASE PREDICTIONS 

In this study, A. K. Tripathy et al. argue that data-driven precision
agriculture, notably pest/disease control, needs dynamic crop-weather
data. An experiment was carried out in a semi-arid environment to
better understand crop-weather-pest/disease relationships, using
wireless sensor and field-level surveillance data on the closely
connected and interdependent pest (Thrips) - disease (Bud Necrosis)
dynamics of the groundnut crop. Data mining techniques were used to
transform the data into useful information, knowledge, relationships,
and trends, and to link the crop-weather-pest/disease continuum. These
dynamics, derived from data mining approaches and trained using
mathematical models, were validated using surveillance data. Data from
the Kharif (monsoon) and Rabi (post-monsoon) seasons might be used to
construct a real-time to near-real-time decision support system for
pest/disease forecasts.


2.2 ANALYSING SOIL DATA USING DATA
MINING CLASSIFICATION TECHNIQUES

In this work, V. Rajeswar et al. suggest that soil is a vital component
in agriculture. The approach aims to predict soil type using data
mining classification algorithms. Methods/Analysis: Data mining
classification algorithms such as JRip, J48, and Naive Bayes are used
to predict soil type. These classifiers are used to extract knowledge
from soil data, and two categories of soil are considered: red and
black soil. Findings: This study summarizes data mining and
agricultural data mining. The JRip model gives more accurate results on
this data, and the Kappa statistic of its predictions is improved.
Application/Improvement: To address big data challenges, effective
solutions that use data mining to improve the classification accuracy
of large soil data sets may be developed.


2.3 THE IMPACT OF DATA ANALYTICS IN 
CROP MANAGEMENT BASED ON 
WEATHER CONDITIONS 

Agriculture, according to A. Swarupa Rani et al., is the most important
application field, particularly in developing nations like India. Data
mining is critical for making decisions on a variety of agricultural
challenges. The purpose of data mining is to extract information from
existing data sets and turn it into a human-readable format for future
use. Crop management in a given agricultural region depends on the
region's climatic conditions, since climate has a large influence on
crop yield. Real-time weather data can assist in optimal crop
management. The use of information and communications technology allows
significant data to be extracted automatically to obtain knowledge and
trends. This eliminates manual tasks, makes it easier to extract data
directly from electronic sources and transfer it to a secure electronic
documentation system, and can lead to lower production costs, higher
yield, and higher market price. The study also showed how data mining
may be used to assess and anticipate useful patterns from massive
amounts of constantly changing climate data.


2.4 SPIKING NEURAL NETWORKS FOR
CROP YIELD ESTIMATION BASED ON
SPATIOTEMPORAL ANALYSIS OF
IMAGE TIME SERIES

In this study, Pritam Bose et al. introduce spiking neural networks
(SNNs) for remote sensing spatiotemporal analysis of image time series,
which take advantage of highly parallel and low-power neuromorphic
hardware platforms. The approach is exemplified by the creation of the
first SNN computational model for crop yield estimation from normalized
difference vegetation index image time series. The study describes the
construction and testing of a methodological framework that uses the
spatial accumulation of time series of Moderate Resolution Imaging
Spectroradiometer 250-m resolution data and historical crop yield data
to train an SNN to forecast crop yield in real time. The research also
includes an examination of the optimal number of features required to
maximize the results from the experimental data set. The proposed
method was used to estimate the yield of winter wheat (Triticum
aestivum L.) in Shandong province, one of China's primary winter
wheat-growing regions.

III. DATASET

Precision agriculture is a current trend. It helps farmers make
informed decisions about their farming strategy. For the system, we use
various datasets, all downloaded from government websites and Kaggle.

1) Datasets include: the cost of cultivation dataset for major crops in
each state and the yield dataset. A brief description of the datasets
follows.

2) Yield Dataset: This dataset contains the yield, in kg per hectare,
of 16 major crops grown across all the states. A yield of 0 indicates
that the crop is not cultivated in the respective state.

3) Data Preprocessing: This step replaces null and 0 yield values with
-1 so that they do not affect the overall prediction. We then encode
the dataset so that it can be fed into the neural network.
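
The preprocessing step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names `clean_yield` and `label_encode`, the field names, and the sample rows are all assumptions made for the example.

```python
import math

def clean_yield(value):
    """Map a missing (None/NaN) or zero yield to the sentinel -1."""
    if value is None or math.isnan(value) or value == 0:
        return -1.0
    return float(value)

def label_encode(values):
    """Give each distinct category an integer code (sorted order)."""
    mapping = {name: code for code, name in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

# Hypothetical rows; the real datasets come from government sites and Kaggle.
rows = [
    {"state": "Tamil Nadu", "crop": "Rice", "yield_kg_per_ha": 3200.0},
    {"state": "Kerala", "crop": "Rice", "yield_kg_per_ha": 0.0},
    {"state": "Tamil Nadu", "crop": "Cotton", "yield_kg_per_ha": None},
]

yields = [clean_yield(r["yield_kg_per_ha"]) for r in rows]
state_codes, state_map = label_encode([r["state"] for r in rows])
crop_codes, crop_map = label_encode([r["crop"] for r in rows])

# Each row is now purely numeric and can be passed to a model.
features = list(zip(state_codes, crop_codes, yields))
print(features)
```

Integer codes are the simplest encoding; for models that treat inputs as magnitudes, a one-hot encoding may be a better choice.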


crop_df = crop_df.dropna().reset_index(drop=True)
crop_df

        State_Name                   District_Name  Crop_Year  Season      Crop                 Area      Production
0       Andaman and Nicobar Islands  NICOBARS       2000       Kharif      Arecanut             1254.0    2000.0
1       Andaman and Nicobar Islands  NICOBARS       2000       Kharif      Other Kharif pulses  2.0
2       Andaman and Nicobar Islands  NICOBARS       2000       Kharif      Rice                 102.0     321.0
3       Andaman and Nicobar Islands  NICOBARS       2000       Whole Year  Banana               176.0     641.0
4       Andaman and Nicobar Islands  NICOBARS       2000       Whole Year  Cashewnut            720.0     165.0
...
242356  West Bengal                  PURULIA        2014       Summer      Rice                 306.0
242357  West Bengal                  PURULIA        2014       Summer      Sesamum              627.0
242358  West Bengal                  PURULIA        2014       Whole Year  Sugarcane            324.0     16250.0
242359  West Bengal                  PURULIA        2014       Winter      Rice                 279151.0  597899.0
242360  West Bengal                  PURULIA        2014       Winter      Sesamum              175.0     88.0

242361 rows x 7 columns


Fig: Data Set Collection

IV. SYSTEM ARCHITECTURE

Farmers face substantial challenges such as crop management, crop yield
prediction, and crop output. Farmers need adequate cultivation
assistance, since many young people are interested in agriculture these
days. The impact of the IT industry on analyzing real-world problems is
increasing, and data in the agricultural sector is growing by the day.
With the growth of the Internet of Things, there are ways to capture
massive amounts of data in the agricultural sector. There is a need for
a system that can clearly assess agricultural data and extract useful
information from this growing data. It is necessary to understand how
to extract insights from data.

Extensive work has been done, and 
many ML algorithms have been applied in the 
agriculture sector. 

The biggest challenge in agriculture is to increase farm production and
offer it to the end user at the best possible price and quality. It is
also observed that at least 50% of farm produce is wasted and never
reaches the end user. The proposed model suggests methods for
minimizing farm produce wastage. In one recent work, S. Pavani et al.
presented a model in which crop yield is predicted using the KNN
algorithm by forming clusters. They report that KNN clustering
performed much better than SVM or regression.

Fig: System Architecture

V. PROPOSED SYSTEM

The system predicts the yield of major crops in a particular district
of Tamil Nadu. On first use, the client has to register on the web
application, which is built with Flask. The login details are stored in
an SQLite database. Once logged in, the user has full access to crop
yield prediction, using inputs such as location, nitrogen, phosphorus,
potassium, and pH values that depend on the environment of their
farming land. The primary soil nutrients for a crop can also be found
by giving the crop name as input. The system passes the various inputs
to the controller, which uses Random Forest for classification.
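
The registration and login storage mentioned above could look roughly like the sketch below. It covers only the SQLite side (the Flask routes are omitted), and the table layout, the function names, and the choice to store a salted PBKDF2 hash rather than the raw password are assumptions for illustration, not details given by the authors.

```python
import hashlib
import hmac
import os
import sqlite3

def open_user_db(path=":memory:"):
    """Open (or create) the user database; ':memory:' is used for the demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "username TEXT PRIMARY KEY, salt BLOB NOT NULL, pw_hash BLOB NOT NULL)"
    )
    return conn

def _hash_password(password, salt):
    # Salted, slow hash so that raw passwords are never stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(conn, username, password):
    """Store a new user; return False if the username is already taken."""
    salt = os.urandom(16)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (username, salt, pw_hash) VALUES (?, ?, ?)",
                (username, salt, _hash_password(password, salt)),
            )
    except sqlite3.IntegrityError:
        return False
    return True

def login(conn, username, password):
    """Return True only if the username exists and the password matches."""
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    return hmac.compare_digest(stored, _hash_password(password, salt))

conn = open_user_db()
print(register(conn, "farmer1", "s3cret"))  # first registration succeeds
print(login(conn, "farmer1", "s3cret"))     # correct password
print(login(conn, "farmer1", "wrong"))      # wrong password
```

In a Flask application these functions would typically be called from the registration and login view functions, with the database path taken from the application configuration.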

We recommend to the farmer the ratio of fertilizer required, based on
soil parameters, and the crop price, using machine learning techniques.
Machine learning (ML) approaches are currently being employed in a
variety of disciplines to provide practical and effective solutions. To
forecast agricultural yield, multiple ML methods based on
classification, clustering, and neural networks can be used. In this
work, we propose a method based on the K-Nearest Neighbors (KNN)
algorithm, which assesses weather conditions and predicts a suitable
crop for cultivation.
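
To make the KNN step concrete, the sketch below classifies a new weather observation by majority vote among its nearest training points. It is an illustration under stated assumptions rather than the authors' implementation: the features (temperature, humidity, rainfall), the sample values, the crop labels, and the name `knn_predict` are all hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical samples: (temperature in deg C, humidity in %, rainfall in mm).
train_X = [
    (30.0, 80.0, 200.0), (31.0, 82.0, 220.0), (29.0, 78.0, 210.0),
    (25.0, 50.0, 60.0), (24.0, 55.0, 70.0), (26.0, 52.0, 65.0),
]
train_y = ["rice", "rice", "rice", "cotton", "cotton", "cotton"]

print(knn_predict(train_X, train_y, (28.0, 75.0, 190.0)))  # rice
print(knn_predict(train_X, train_y, (25.0, 53.0, 68.0)))   # cotton
```

Because the distance is dominated by the feature with the largest numeric range (rainfall here), features are usually scaled to a common range before KNN is applied.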


VI. NEURAL NETWORK 

A neural network is a set of algorithms that attempts to recognize
underlying relationships in a batch of data using a method that mimics
how the human brain works. Because neural networks can adapt to
changing input, they can produce the best possible outcome without
requiring the output criteria to be redesigned.

A neural network is analogous to the network of neurons in the human
brain. In a neural network, a "neuron" is a mathematical function that
collects and categorizes data according to a specified architecture.
The two statistical procedures that the network most closely resembles
are curve fitting and regression analysis.
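
The idea of a neuron as a mathematical function can be illustrated with a few lines of code. The sketch below is a generic single neuron (weighted sum plus bias, passed through a sigmoid activation); the weights and inputs are arbitrary example values, and the paper does not state which activation its network uses.

```python
import math

def neuron(inputs, weights, bias):
    """Single artificial neuron: sigmoid(weights . inputs + bias)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Arbitrary example values: the weighted sum is 0.5*1.0 - 0.25*2.0 + 0.0 = 0.0,
# and the sigmoid of 0 is 0.5.
print(neuron([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```

A network stacks many such functions in layers, and training adjusts the weights and biases so that the outputs fit the data, which is where the resemblance to curve fitting and regression comes from.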

Fig: KNN Algorithm (before K-NN: Category A, Category B, and a new data point; the new data point is then assigned to Category 1)

Our system uses crop and meteorological data as inputs. In addition,
our method suggests a fertilizer based on the predicted crop.

We have used the following popular algorithms: linear regression,
logistic regression, neural networks, and KNN. All of these algorithms
are based on supervised learning. Our overall system is divided into
three modules:

Advantages of Proposed System

1) The proposed model predicts the crop yield for the data sets of the
given region. Integrating agriculture and ML will contribute to further
improvements in the agriculture sector by increasing yields and
optimizing the resources involved. Data from previous years are the key
elements in forecasting current performance.

2) The proposed system uses a recommender system to suggest the right
time for using fertilizers.

3) The methods in the proposed system include increasing crop yield,
real-time analysis of crops, selecting efficient parameters, making
smarter decisions, and getting better yield.

The test results show that our method accurately predicts crop
selection and yield, which helps farmers to a great extent.


VII. CONCLUSION AND RESULTS

In this project, we applied the KNN machine learning algorithm to the
dataset and obtained an accuracy of 65.05%. We are now applying other
algorithms, such as ANN (artificial neural network) and SVM (support
vector machine), to increase the efficiency and accuracy of the
project. We take the datasets from various government websites, such as
https://data.gov.in/, and from Kaggle, and apply various parameters and
algorithms to get the maximum accuracy. The maximum accuracy attained
so far is 65.05%, achieved with the KNN algorithm.
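
For reference, an accuracy figure of this kind is the fraction of test samples whose predicted label matches the true label. The sketch below shows that computation on made-up labels; it is not the evaluation code used in the project, and the sample labels are purely illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Made-up labels: 3 of 4 predictions are correct.
y_true = ["rice", "rice", "cotton", "maize"]
y_pred = ["rice", "cotton", "cotton", "maize"]
print(f"{accuracy(y_true, y_pred):.2%}")  # 75.00%
```

Accuracy should be measured on data held out from training; otherwise the figure overstates how well the model generalizes.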


Fig: Weather Forecast

Fig: Crop Recommender System

Output for Production analysis

# The helper functions (getStateCode, validateCode, ...) and the trained
# `model` are assumed to be defined earlier in the notebook.
state_name = input("Enter State Name: ")
state_code = getStateCode(state_name)
validateCode(state_code)

district_name = input("Enter District Name: ")
district_code = getDistrictCode(district_name)
validateCode(district_code)

# The year is fixed here; the interactive prompt is commented out.
crop_year = 2022  # float(input("Enter Crop Year: "))

season_name = input("Enter Season: ")
season_code = getSessionCode(season_name)
validateCode(season_code)

crop_name = input("Enter Crop: ")
crop_code = getCropCode(crop_name)
validateCode(crop_code)

crop_area = float(input("Enter Area: "))

# The feature order must match the order used when the model was trained.
result = model.predict([[state_code, district_code, crop_year, season_code, crop_code, crop_area]])
print("Production:", result[0])


Enter State Name: Uttar Pradesh
Enter District Name: BIJNOR
Enter Season: Rabi
Enter Crop: Potato
Enter Area: 861
Production: 2487


Fig: Output for Production analysis 
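
The listing above relies on lookup helpers whose definitions are not shown in the paper. One plausible shape for them is sketched below; `make_encoder` and `validate_code` are hypothetical names, and the use of -1 for an unknown category is an assumption chosen for the example.

```python
def make_encoder(names):
    """Build a case-insensitive lookup from category name to integer code."""
    normalized = sorted({name.strip().lower() for name in names})
    table = {name: code for code, name in enumerate(normalized)}

    def get_code(name):
        # Unknown names map to -1 so that validation can reject them.
        return table.get(name.strip().lower(), -1)

    return get_code

def validate_code(code):
    """Raise if the lookup failed, so bad input never reaches the model."""
    if code < 0:
        raise ValueError("unknown category; check the spelling of the input")
    return code

# Hypothetical season list; the real categories come from the dataset.
get_season_code = make_encoder(["Kharif", "Rabi", "Summer", "Whole Year", "Winter"])
print(validate_code(get_season_code("Rabi")))  # 1
```

The codes produced at prediction time must be the same ones used when the training data was encoded, so in practice the lookup table would be built once from the training set and reused.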


VIII. FUTURE WORK

1) Additional features can be added to the system.

2) Currently, the system takes the necessary datasets as input from
various government sites and Kaggle and indicates an appropriate crop
to cultivate.

3) In the future, automation can be added to the system so that it
responds to feedback.

4) The system can be updated to give results according to the humidity,
water levels, and temperature of the surroundings.

5) The system can be updated so that it suggests a crop that gives high
production in the area without harming soil fertility or the
environment through its chemical components.